Relative gradient optimization of the Jacobian term in unsupervised deep learning

Neural Information Processing Systems

Learning expressive probabilistic models correctly describing the data is a ubiquitous problem in machine learning. A popular approach for solving it is mapping the observations into a representation space with a simple joint distribution, which can typically be written as a product of its marginals -- thus drawing a connection with the field of nonlinear independent component analysis. Deep density models have been widely used for this task, but their maximum likelihood based training requires estimating the log-determinant of the Jacobian and is computationally expensive, thus imposing a trade-off between computation and expressive power. In this work, we propose a new approach for exact training of such neural networks. Based on relative gradients, we exploit the matrix structure of neural network parameters to compute updates efficiently even in high-dimensional spaces; the computational cost of the training is quadratic in the input size, in contrast with the cubic scaling of naive approaches. This allows fast training with objective functions involving the log-determinant of the Jacobian, without imposing constraints on its structure, in stark contrast to autoregressive normalizing flows.
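As a rough sketch of why the relative gradient avoids matrix inversion (our own toy example, not the authors' code; the tanh-based latent score and the function names are illustrative assumptions): for a single layer y = Wx, the ordinary gradient of log p_s(Wx) + log|det W| contains the O(D^3) term W^{-T}, whereas right-multiplying the whole gradient by W^T W turns that term into W and leaves only O(D^2) matrix-vector and outer products.

    import numpy as np

    def naive_grad(W, x):
        # Ordinary gradient of log p_s(Wx) + log|det W| w.r.t. W.
        # Requires inv(W): O(D^3).
        y = W @ x
        delta = -np.tanh(y)               # score of a toy latent density (our assumption)
        return np.outer(delta, x) + np.linalg.inv(W).T

    def relative_grad(W, x):
        # Relative-gradient direction: the ordinary gradient right-multiplied by W^T W.
        # The W^{-T} term collapses to W; only matrix-vector and outer products remain: O(D^2) per sample.
        y = W @ x
        delta = -np.tanh(y)
        return np.outer(delta, W.T @ y) + W

    rng = np.random.default_rng(0)
    D = 5
    W = rng.standard_normal((D, D)) + 3 * np.eye(D)   # keep W comfortably invertible
    x = rng.standard_normal(D)

    # Sanity check: the two directions agree up to the metric factor W^T W.
    assert np.allclose(naive_grad(W, x) @ W.T @ W, relative_grad(W, x))

Ascent steps of the form W <- W + epsilon * relative_grad(W, x) then realize the quadratic per-sample cost claimed in the abstract.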



Author response for NeurIPS paper: Relative gradient optimization of the Jacobian term in unsupervised deep learning

Neural Information Processing Systems

We thank the reviewers for their comments and the largely positive feedback. Reviewers agree that the paper is clear and that the improvement our approach provides is demonstrated by the experiments; the contribution was praised as "elegant", and its novelty was also noted. Rigorous formulation and convergence properties of the relative gradient: we will add more details on this. We will include these references in the paper. These architectures have several limitations; we will include this discussion and reference in the paper. R6: too much emphasis on existing concepts, too little on the proposed approach: we will try to balance this.




Review for NeurIPS paper: Relative gradient optimization of the Jacobian term in unsupervised deep learning

Neural Information Processing Systems

Summary and Contributions: Quite a bit of recent research on deep density estimation under the normalizing flows umbrella has focused on efficiently computing (a restricted form of) the Jacobian term that appears in the objective. Such models operate with a set of transformations for which the computation of this term is easy. While arbitrary distributions can be learned by such methods, the features that are learned are quite skewed, which can prevent learning a properly disentangled representation. This paper presents a conceptually simple method to optimize for exact maximum likelihood in such models. In particular, the authors consider a transform from the observed to the latent space, parameterized by fully connected networks, with the only constraint that the weight matrices are invertible. Since the parameters of the transformation are matrices, the authors use properties of the Riemannian geometry of matrix spaces to derive updates in terms of the relative gradient.
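The geometric argument alluded to here can be sketched as follows (standard relative-gradient reasoning, written in our own notation rather than quoted from the paper): instead of an additive perturbation of W, consider a multiplicative one, W -> W + \varepsilon E W.

    \[
      L\big(W + \varepsilon\, E\, W\big)
        \;\approx\; L(W) + \varepsilon\,\langle \nabla_W L,\; E\,W \rangle
        \;=\; L(W) + \varepsilon\,\langle (\nabla_W L)\, W^{\top},\; E \rangle ,
    \]
    so the steepest-ascent perturbation is $E = (\nabla_W L)\, W^{\top}$, i.e. the update
    \[
      \Delta W \;=\; \varepsilon\, (\nabla_W L)\, W^{\top} W .
    \]
    Applied to the Jacobian term, $\nabla_W \log\lvert\det W\rvert = W^{-\top}$ and
    $W^{-\top} W^{\top} W = W$, so the update never inverts $W$.

The factor W^T W plays the role of a metric induced by the multiplicative (matrix-group) structure, which is the sense in which the Riemannian geometry of matrix spaces enters.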


Review for NeurIPS paper: Relative gradient optimization of the Jacobian term in unsupervised deep learning

Neural Information Processing Systems

The focus of the work is deep density estimation (also called normalizing flows). In particular, the authors focus on the generative model x = f(s) as defined in (1), where the observation (x) is described as an invertible non-linear function (f) of a latent variable (s). They take a maximum-likelihood perspective (2) in which g_{\theta}, the approximation of the inverse of f, is the composition of invertible and differentiable component functions g_1 = \sigma_1(W_1 \cdot), ..., g_L = \sigma_L(W_L \cdot). They propose to use the relative gradient method to optimize \theta in order to speed up computations. Deep density estimation is an important problem in machine learning.
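Written out explicitly (our notation, with z_0 = x and z_l = g_l(z_{l-1})), the log-likelihood in (2) decomposes layer-wise, which is where the log-determinant terms targeted by the relative gradient appear:

    \[
      \log p_x(x;\theta)
        \;=\; \log p_s\!\big(g_\theta(x)\big)
        \;+\; \sum_{l=1}^{L} \Big( \log\lvert\det W_l\rvert
              \;+\; \sum_{i} \log\big\lvert \sigma_l'\big((W_l z_{l-1})_i\big) \big\rvert \Big).
    \]

The elementwise \sigma_l' terms are cheap; the \log|\det W_l| terms are the ones whose naive gradient, W_l^{-T}, costs O(D^3) and which the relative gradient handles without inversion.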



Relative gradient optimization of the Jacobian term in unsupervised deep learning

Gresele, Luigi, Fissore, Giancarlo, Javaloy, Adrián, Schölkopf, Bernhard, Hyvärinen, Aapo

arXiv.org Machine Learning

Learning expressive probabilistic models correctly describing the data is a ubiquitous problem in machine learning. A popular approach for solving it is mapping the observations into a representation space with a simple joint distribution, which can typically be written as a product of its marginals -- thus drawing a connection with the field of nonlinear independent component analysis. Deep density models have been widely used for this task, but their maximum likelihood based training requires estimating the log-determinant of the Jacobian and is computationally expensive, thus imposing a trade-off between computation and expressive power. In this work, we propose a new approach for exact training of such neural networks. Based on relative gradients, we exploit the matrix structure of neural network parameters to compute updates efficiently even in high-dimensional spaces; the computational cost of the training is quadratic in the input size, in contrast with the cubic scaling of naive approaches. This allows fast training with objective functions involving the log-determinant of the Jacobian, without imposing constraints on its structure, in stark contrast to autoregressive normalizing flows.
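To make the layer-wise bookkeeping concrete, here is one way a full relative-gradient step for a stack z_l = tanh(W_l z_{l-1}) could look. This is an illustrative sketch under our own assumptions (tanh nonlinearity, standard-Gaussian latent prior, single-sample update), not the authors' released implementation.

    import numpy as np

    def forward(Ws, x):
        # z_l = tanh(W_l z_{l-1}); keep pre-activations a_l = W_l z_{l-1} for the backward pass.
        zs, pre = [x], []
        for W in Ws:
            a = W @ zs[-1]
            pre.append(a)
            zs.append(np.tanh(a))
        return zs, pre

    def relative_step(Ws, x, lr=1e-3):
        # One ascent step on log p_s(g(x)) + sum_l [ log|det W_l| + sum_i log|tanh'(a_{l,i})| ].
        # Each log|det W_l| gradient enters as +W_l rather than W_l^{-T}: no inversion needed.
        zs, pre = forward(Ws, x)
        delta = -zs[-1]                                   # score of an assumed standard-Gaussian latent prior
        new_Ws = []
        for W, a in zip(reversed(Ws), reversed(pre)):
            t = np.tanh(a)
            delta = delta * (1.0 - t ** 2) - 2.0 * t      # chain rule plus d/da sum_i log|tanh'(a_i)|
            rel_grad = np.outer(delta, W.T @ a) + W       # equals (delta z_in^T) W^T W + W, since a = W z_in
            new_Ws.append(W + lr * rel_grad)
            delta = W.T @ delta                           # send the signal to the layer below
        return list(reversed(new_Ws))

    rng = np.random.default_rng(0)
    D, depth = 4, 3
    Ws = [rng.standard_normal((D, D)) + 3 * np.eye(D) for _ in range(depth)]
    Ws = relative_step(Ws, rng.standard_normal(D))

Every operation in the loop is a matrix-vector or outer product, so the per-sample cost is quadratic in the input size per layer, consistent with the scaling stated in the abstract.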